Locality-enhancing loop transformations for parallel tree traversal algorithms
نویسندگان
چکیده
Exploiting locality is critical to achieving good performance. For regular programs, which operate on dense arrays and matrices, techniques such as loop interchange and tiling have long been known to improve locality and deliver improved performance. However, there has been relatively little work investigating similar locality-improving transformations for irregular programs that operate on trees or graphs. Often, it is not even clear that such transformations are possible. In this paper, we discuss two transformations that can be applied to irregular programs that perform graph traversals. We show that these transformations can be seen as analogs of the popular regular transformations of loop interchange and tiling. We demonstrate the utility of these transformations on two tree traversal algorithms, the Barnes-Hut algorithm and raytracing, achieving speedups of up to 251% over the baseline implementation.
منابع مشابه
Lecture Notes on Loop Transformations for Cache Optimization 15-411: Compiler Design
In this lecture we consider loop transformations that can be used for cache optimization. The transformations can improve cache locality of the loop traversal or enable other optimizations that have been impossible before due to bad data dependencies. Those loop transformations can be used in a very flexible way and are used repeatedly until the loop dependencies are well aligned with the memor...
متن کاملOil and Water can mix! Experiences with integrating Polyhedral and AST-based Transformations
The polyhedral model is an algebraic framework for affine program representations and transformations for enhancing locality and parallelism. Compared with traditional AST-based transformation frameworks, the polyhedral model can easily handle imperfectly nested loops and complex data dependences within and across loop nests in a unified framework. On the other hand, AST-based transformation fr...
متن کاملA Matrix-Based Approach to Global Locality Optimization
Global locality optimization is a technique for improving the cache performance of a sequence of loop nests through a combination of loop and data layout transformations. Pure loop transformations are restricted by data dependences and may not be very successful in optimizing imperfectly nested loops and explicitly parallelized programs. Although pure data transformations are not constrained by...
متن کاملProgram Transformation for Locality Using Affinity Regions
AAnity regions ensure that a shared processor schedule, mapping loop iterations to processors, is used in consecutive parallel loop nests. Using aanity regions can improve locality without aaecting par-allelism. Unlike loop fusion, aanity regions are always safe. Also, unlike parallel regions, aanity regions do not require explicit code for mapping loop iterations to processors. While aanity re...
متن کاملAn improved algorithm to reconstruct a binary tree from its inorder and postorder traversals
It is well-known that, given inorder traversal along with one of the preorder or postorder traversals of a binary tree, the tree can be determined uniquely. Several algorithms have been proposed to reconstruct a binary tree from its inorder and preorder traversals. There is one study to reconstruct a binary tree from its inorder and postorder traversals, and this algorithm takes running time of...
متن کامل